
Dynamic Incentive-Aware Learning: Robust Pricing in Contextual Auctions

Neural Information Processing Systems

Motivated by pricing in ad exchange markets, we consider the problem of robust learning of reserve prices against strategic buyers in repeated contextual second-price auctions. Buyers' valuations for an item depend on the context that describes the item. However, the seller is not aware of the relationship between the context and buyers' valuations, i.e., buyers' preferences. The seller's goal is to design a learning policy to set reserve prices via observing the past sales data, and her objective is to minimize her regret for revenue, where the regret is computed against a clairvoyant policy that knows buyers' heterogeneous preferences. Given the seller's goal, utility-maximizing buyers have the incentive to bid untruthfully in order to manipulate the seller's learning policy.


Reviews: Dynamic Incentive-Aware Learning: Robust Pricing in Contextual Auctions

Neural Information Processing Systems

The authors study the problem of setting (individual) reserve prices in a scenario of repeated contextual second-price auctions. The buyers are assumed to be strategic, i.e., each optimizes a cumulative discounted utility, and their valuations are linear functions of the feature vector of a good. The considered scenario explicitly assumes the existence of market noise. The seller's goal is to find an algorithm for setting prices that has sub-linear regret. Two algorithms are proposed: the first attains an O(d log(Td) log(T)) regret bound when the market noise distribution is known to the seller.
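To make the setup concrete, here is a minimal sketch (not the authors' code; `theta`, `valuation`, and the noise scale are illustrative assumptions) of the valuation model described above: a buyer's value for a good is linear in its feature vector, perturbed by market noise.

```python
import numpy as np

rng = np.random.default_rng(0)

d = 4                            # dimension of the contextual information
theta = rng.uniform(0, 1, d)     # buyer's preference vector (unknown to the seller)
theta /= np.linalg.norm(theta, 1)  # normalize for illustration

def valuation(x, noise_scale=0.1):
    """Linear valuation <theta, x> plus market noise z (assumed Gaussian here)."""
    z = rng.normal(0.0, noise_scale)
    return float(theta @ x + z)

x_t = rng.uniform(0, 1, d)       # context (feature vector) of the good at round t
v_t = valuation(x_t)             # buyer's realized valuation at round t
```

The seller never observes `theta` directly; she must infer it from sales outcomes while buyers may bid below `v_t` to influence her future reserve prices.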



Dynamic Incentive-aware Learning: Robust Pricing in Contextual Auctions

Golrezaei, Negin, Javanmard, Adel, Mirrokni, Vahab

arXiv.org Machine Learning

Motivated by pricing in ad exchange markets, we consider the problem of robust learning of reserve prices against strategic buyers in repeated contextual second-price auctions. Buyers' valuations for an item depend on the context that describes the item. However, the seller is not aware of the relationship between the context and buyers' valuations, i.e., buyers' preferences. The seller's goal is to design a learning policy to set reserve prices via observing the past sales data, and her objective is to minimize her regret for revenue, where the regret is computed against a clairvoyant policy that knows buyers' heterogeneous preferences. Given the seller's goal, utility-maximizing buyers have the incentive to bid untruthfully in order to manipulate the seller's learning policy. We propose learning policies that are robust to such strategic behavior. These policies use the outcomes of the auctions, rather than the submitted bids, to estimate the preferences while controlling the long-term effect of the outcome of each auction on the future reserve prices. When the market noise distribution is known to the seller, we propose a policy called Contextual Robust Pricing (CORP) that achieves a T-period regret of $O(d\log(Td) \log (T))$, where $d$ is the dimension of the contextual information. When the market noise distribution is unknown to the seller, we propose two policies whose regrets are sublinear in $T$.
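For readers unfamiliar with the auction format, the following sketch illustrates the assumed mechanics of a second-price auction with personalized reserve prices (this is a standard textbook formulation, not the authors' CORP policy): the highest bidder who clears her reserve wins and pays the maximum of the highest competing eligible bid and her own reserve.

```python
def run_auction(bids, reserves):
    """Second-price auction with per-buyer reserves (illustrative sketch).

    Returns (winner_index, payment); winner is None when no bid clears
    its reserve and the item goes unsold.
    """
    # Only bids at or above their buyer's reserve are eligible to win.
    eligible = [i for i, b in enumerate(bids) if b >= reserves[i]]
    if not eligible:
        return None, 0.0
    winner = max(eligible, key=lambda i: bids[i])
    # Price: the highest competing eligible bid, floored by the winner's reserve.
    others = [bids[i] for i in eligible if i != winner]
    second = max(others) if others else 0.0
    payment = max(second, reserves[winner])
    return winner, payment

winner, revenue = run_auction(bids=[0.9, 0.6, 0.4], reserves=[0.5, 0.5, 0.5])
# winner 0 pays max(0.6, 0.5) = 0.6
```

Because the payment depends on reserves the seller sets from past data, a forward-looking buyer may shade her bid today to push tomorrow's reserve down, which is exactly the strategic manipulation the proposed policies are designed to resist.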